Abstract: Student evaluations provide rich information about teaching 
performance, but a number of factors beyond teacher effectiveness influence 
student evaluations. In this study we examined the effects of professor gender and 
perceived age on ratings of effectiveness and rapport as well as academic 
performance. We also asked students to rate professor attractiveness as a 
potential explanation for group differences. Participants (N = 308) saw a picture 
of either a young or old male or female professor while listening to an audio 
lecture. Students reported greater perceived rapport and attractiveness with the 
female relative to the male professors and for younger versus older professors. 
However, students reported the male professors as more effective than the female 
professors. An interaction revealed that among female professors only, younger 
women were rated as more attractive than comparison conditions. Thus, age and 
gender bias likely impact student evaluations of teaching. Our study also revealed 
higher quiz grades in the older-female condition. 

Keywords: student evaluations, professor age, professor gender, rapport, student 
grades 

Student evaluations help professors consider changes such as teaching style, course 
content, and classroom policies in an effort to help students learn and retain information. The 
opportunity for reflection is perhaps the most powerful benefit of evaluations. At many colleges, 
evaluations also are used to determine raises, promotions, tenure, and even teaching awards. In 
support of widespread use, data indicate that teaching evaluations can be good indicators of 
teaching effectiveness (e.g., Marsh & Roche, 1997; Remedios & Lieberman, 2008). 

However, numerous factors beyond teacher effectiveness influence student evaluations. 
For example, variables such as a student’s prior interest in the subject matter, expected grade in 
the course, and reason for taking the course reside within the student and are outside of an 
instructor’s control. In a review of the literature, Marsh and Roche (1997) found that these and 
other factors affected student evaluations of teachers, and although some effect sizes were small, 
variability in evaluations is troublesome. Many contextual variables that impact student 
evaluations also are, for the most part, outside of the instructor’s control. For example, class size, 
classroom setup, availability of technology, distractions outside the classroom, and larger 
contextual issues such as natural disasters, war, or terrorist activity (e.g., 9-11) might influence 
student evaluations. Instructors must contend with variability in teaching evaluations based on 
student and contextual variables, recognizing that negative emotions may become associated 
with the course and teacher. 

Unfortunately, student evaluations may also vary based on characteristics of instructors 
themselves, such as their gender, age, and attractiveness. For the most part, these variables are 
unchangeable and yet create bias in evaluations. Student perceptions likely tap into cultural 
norms related to gender, age, and attractiveness. For example, women are expected to be kind 
(Ebert, Steffens, & Kroth, 2014), agreeable, open, and conscientious (Lockenhoff, 2014), and both 
men and women are expected to be competent (Lockenhoff, 2014). In the teaching realm, 
Kierstead, D’Agostino, and Dill (1988) found that whereas male professors earned high student 
evaluations if they demonstrated competence, female professors had to demonstrate both 
competence and warmth to obtain the same high ratings. Similarly, youth is associated with more 
liking (Danziger & Welfel, 2000). In the classroom, Horner, Murray, and Rushton (1989) found 
a negative correlation between instructor age and ratings of teaching effectiveness (r = -.33). 
Likewise, Wilson, Beyer, and Monteiro (2014) used images of young and old adults and found 
that older professors received more negative ratings on perceptions of friendliness and rapport 
than younger professors. An interaction between gender and age showed that students rated an 
older woman as less organized than a younger woman, although male professors were rated the 
same regardless of age. 

Perhaps related to age, attractiveness is tied to positive perceptions. Allport (1954) asserted 
that society wants to believe in a just world. Individuals want to believe that people get what they 
deserve. Relatedly, the belief is that physically beautiful people have somehow earned beauty by 
embodying good qualities (i.e., “what is beautiful is good;” Dion, Berscheid, & Walster, 1972). 
Specifically in research on the scholarship of teaching and learning, more attractiveness has been 
linked to higher overall student ratings (e.g., Riniolo, Johnson, Sherman, & Misso, 2006), higher 
ratings of teacher effectiveness (Felton, Mitchell, & Stinson, 2004), better course quality (Felton, 
Koper, Mitchell, & Stinson, 2008), higher grades (Gurung & Vespia, 2007), and perceptions of 
being more approachable, likeable, and providing a more enjoyable class (Gurung & Vespia, 
2007). In a large-scale study of online ratings of professors (N = 2281), Freng and Webber 
(2009) found that even after controlling for a number of other factors, professor attractiveness 
still accounted for 8% of the variance in student ratings. Hamermesh and Parker (2005) 
suggested that more positive ratings may relate to more learning if students focus more 
intently on attractive professors, remaining more engaged and earning higher grades. 

As the above studies indicate, student perceptions are influenced by instructor gender, 
age, and attractiveness. However, the bulk of research asks students to evaluate professors with 
whom they have interacted. The richness of social interactions brings nuisance variability 
associated with a number of variables, including instructor experience, skill, and warmth, that 
potentially covary with less changeable instructor characteristics. For example, positive social 
interactions may actually lead students to provide higher evaluations over time. In fact, even in 
the absence of positive experiences with the instructor, repeated exposure to a stimulus results in 
more positive attitudes toward that stimulus (see Bornstein, 1989, for a review of the Mere 
Exposure Effect; Zajonc, 1968). Further, the complexities of the classroom do not allow isolation 
of variables such as instructor gender, age, and attractiveness. 

In an attempt to remove influences of repeated and varied interactions as well as isolate 
variables, Goebel and Cashen (1979) focused on age and attractiveness. They asked students to 
look at photographs of professors and complete student evaluations. Students expected less 
friendliness and poorer organization from older teachers than younger teachers. Students also 
rated the more attractive professors as friendlier, more encouraging, more organized, less likely 
to give too much work, and better professors overall than the unattractive professors. Thus, 
instructor beauty influences student perceptions and expectations when no additional information 
is provided. When evaluating professors, indeed, what is beautiful is good. 

Wilson and colleagues (2014) extended research outside of the classroom by showing 
students photographs of younger and older professors. Students rated younger professors as more 
attractive than older professors, revealing a bias toward youth. This societal perception is 
pervasive, with youth typically seen as more attractive (Bargh, Chen, & Burrows, 1996; 
Zebrowitz, Olson, & Hoffman, 1993). Because attractiveness relates with higher teaching 
evaluations, older professors may suffer. 

Of course, cultural expectations of attractiveness across the lifespan differ for men and 
women. A youthful appearance among women is particularly valued, with younger women seen 
as more attractive than older women (Sarwer, Grossbart, & Didie, 2003). Although Wilson and 
colleagues (2014) found higher student ratings of attractiveness for younger professors, the effect 
was driven by ratings of female pictures. In an interaction, younger female professors were rated 
as more attractive than older female professors; this effect was not seen for male professors. 
Based on these data, older female professors may experience the most bias in teaching 
evaluations if they are seen as unattractive by a society valuing youth. 

Interestingly, even when beauty cannot be assessed, instructor age and gender influence 
student evaluations. Arbuckle and Williams (2003) showed students a computer-generated 
gender-neutral stick figure presenting a 35-minute lecture in a gender-neutral voice. They asked 
students to identify the stick figure as either male or female and old or young as well as evaluate 
the lecture. Of the four possible professor age/gender groups, when students perceived the stick 
figure as a young male professor, they rated the figure as speaking more enthusiastically and 
using a meaningful tone of voice. 

The available literature suggests that instructor gender and age influence evaluations even 
when students have only a picture or a stick figure on which to base judgments. Researchers 
might argue that such studies have limited external validity. Conversely, classroom studies may 
be influenced by the ongoing, dynamic nature of professor-student interactions, reducing internal 
validity. In the current study, we provided photographs of professors as well as additional 
classroom-related information in the form of a lecture. Students viewed either a male or female 
professor who was either young or old, listened to and completed a quiz on a brief lecture, and 
then rated the professors on attractiveness, effectiveness, and rapport. Our hypotheses were as 
follows: 

1. We expected professor gender to affect student expectations of instructor 
effectiveness, with higher effectiveness ratings for the male professors, regardless of 
age, than female professors. 

2. We expected professor gender and age to affect ratings of the instructor, with higher 
ratings of attractiveness and rapport expected for the younger female professor. 

3. Finally, grades were expected to be higher for the young female instructor based on 
the supposition (Hamermesh & Parker, 2005) that students’ focus is enhanced by 
attractiveness. 


Method 


Participants 

In this study, 340 (127 men and 213 women) students from a southeastern university 
participated. Our sample consisted of 242 white participants, 75 black participants, and 23 
participants who identified as Asian, Native American, Hispanic, or of mixed ethnicity or who 
gave no response to this demographic item. The average age of our sample was 19.88 years old 
(SD = 2.79). The majority of participants were enrolled in an introductory psychology course, 
although students in additional psychology courses were invited to participate if their instructor 
allowed credit. All students signed up to participate using an online participant-management 
system, after which they received a link to an online study hosted by Qualtrics. We received IRB 
approval prior to running the study, and all students were treated ethically. Of the original 340 
participants, 308 completed the entire survey and remained in the data set. 

Journal of the Scholarship of Teaching and Learning, Vol. 15, No. 4, August, 2015. 
Josotl.indiana.edu 
Joye, S. and Wilson, J. 

Materials 

Two pictures of “instructors” were chosen from a web search of publicly available 
pictures. Both images were then digitally altered to make each of them appear older (i.e., 
wrinkles and lighter hair color), for a total of four images. We used black-and-white, 
high-quality pictures of the head and neck only. 

The lecture was a three-minute audio file about the history of Bedlam Hospital in 
London. A 16-year-old boy prepared the audio file, and digital alteration increased the frequency 
to create a gender-ambiguous audio that could feasibly represent both ages and genders of 
“instructors.” At the beginning of the Qualtrics survey, students were told that they would hear a 
file that was digitally altered. Thus, the artificial sound of the file should have been explained, 
allowing students to “buy in” to the gender implied by the instructor’s picture they viewed. 
Before the lecture began, students read that they would be quizzed on lecture material. A 
10-item, true-false quiz assessed learning based on lecture content. 
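
As a minimal sketch of how a 10-item, true-false quiz could be converted to the percent-correct 
grade used later in the analysis, consider the following. The answer key and the responses are 
hypothetical placeholders, not the actual quiz content, and the function name is an assumption 
made for illustration. 

```python
def percent_correct(responses, answer_key):
    """Return the percentage of true/false items answered correctly."""
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    n_correct = sum(r == k for r, k in zip(responses, answer_key))
    return 100.0 * n_correct / len(answer_key)


# Hypothetical 10-item key and one hypothetical participant's answers.
ANSWER_KEY = [True, False, True, True, False, False, True, False, True, True]
example_responses = [True, False, False, True, False, True, True, False, True, False]

# Seven of the ten hypothetical answers match the key.
example_score = percent_correct(example_responses, ANSWER_KEY)
```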

Perceptions of professor effectiveness were assessed using measures from both 
Goebel and Cashen (1979) and Wilson and colleagues (2014). These included seven items of 
teacher effectiveness, including the teacher encouraging questions, expecting good work, 
assigning too much work, being organized, explaining concepts, behaving in a friendly manner 
toward students, and overall being a good teacher. For each item, ratings ranged from 1-5, with 1 
representing “Strongly Disagree,” 2 indicating “Disagree,” 3 rating “Neither Agree nor 
Disagree,” 4 representing “Agree,” and 5 indicating “Strongly Agree.” Based on prior research, 
these items were not consolidated but instead served as separate measures of teacher 
effectiveness. 

Students also completed the Brief Professor-Student Rapport Scale. The scale contains 
six items that assess student perceptions of rapport with an instructor. Five-point ratings range 
from Strongly Disagree to Strongly Agree; two of the items are reverse scored. We altered the 
wording slightly by asking students to rate how they thought the professor in the picture would 
behave rather than indicate how an existing professor did, in fact, behave. For example, “My 
professor makes class enjoyable” was changed to “This instructor would make class enjoyable.” 
The brief version of the scale shows good convergent and discriminant validity with other 
measures of rapport (Ryan & Wilson, 2014) and correlates with numerous positive student 
outcomes, including student motivation, number of classes missed, attitudes toward the professor 
and course, and learning, including end-of-term grades (Wilson & Ryan, 2013). We used average 
ratings in data analyses. 
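
The scoring described above (six items on a 1-5 scale, two of them reverse scored, then averaged) 
might be sketched as follows. Which two items are reverse scored is not stated here, so the 
positions in REVERSED_ITEMS are purely hypothetical, as are the function name and the example 
ratings. 

```python
# Hypothetical 0-based positions of the two reverse-scored items.
REVERSED_ITEMS = frozenset({2, 5})


def rapport_score(ratings, reversed_items=REVERSED_ITEMS, scale_min=1, scale_max=5):
    """Average six ratings after flipping the reverse-scored items."""
    if len(ratings) != 6:
        raise ValueError("the brief scale has six items")
    if any(r < scale_min or r > scale_max for r in ratings):
        raise ValueError("ratings must fall within the response scale")
    # On a 1-5 scale, reverse scoring maps 1 to 5, 2 to 4, and so on.
    adjusted = [
        (scale_min + scale_max - r) if i in reversed_items else r
        for i, r in enumerate(ratings)
    ]
    return sum(adjusted) / len(adjusted)


# One hypothetical participant: items 2 and 5 are flipped before averaging.
example_rapport = rapport_score([4, 5, 2, 4, 3, 1])
```

Flipping before averaging keeps higher scores meaning more perceived rapport on every item. 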

Attractiveness of the pictured person was rated using one item: “How attractive do you 
think this instructor is?” rated on a 7-point Likert scale from Very Unattractive to Very 
Attractive. Students also rated their perception of the teacher’s age in years by answering the 
item: “How old do you think this instructor is?” 

Procedure 

Participants clicked on a link to the survey and indicated their agreement to continue by 
clicking the bottom of an informed-consent form. Next they saw one of four instructor pictures 
randomized based on the computer program: adult man, older adult man, adult woman, and older 
adult woman. Near the picture was a standard bar for an audio file with an iconic Play button. In 
bold below the box, students learned that the voice they would hear had been digitally altered. 
They were instructed to listen to the entire lecture because they would be quizzed on the material 
immediately following the lecture. 

After listening to the lecture and completing the quiz, students completed the seven 
professor assessments used by Goebel and Cashen (1979) and Wilson and colleagues (2014). 
Next, participants completed the Brief Professor-Student Rapport Scale. The picture remained 
above the scale when completing ratings. On the next page, the picture again appeared at the top, 
and students were asked to indicate perceived age of the instructor and attractiveness. Lastly, 
students provided basic demographic information on themselves. 

Results 


Manipulation Check 

We intended to create the impression of young versus old male and female professors. 
Our manipulation check revealed that we did create the intended perceptions based on a 
significant difference between assumed ages of professors, F(1, 304) = 148.13, p < .001, partial 
η² = .33. Students rated the digitally altered “older” pictures as older (M = 43.49, SEM = .49, n = 
151) than the unaltered “younger” pictures (M = 35.46, SEM = .49, n = 157). However, we also 
noted that instructor gender influenced assumed age, F(1, 304) = 17.99, p < .001, partial η² = .06, 
with men viewed as older (M = 40.72, SEM = .55, n = 158) than women (M = 38.00, SEM = .61, 
n = 150). Further, instructor gender and aging of pictures interacted to affect assumed age, F(1, 
304) = 9.87, p = .002, partial η² = .03. The aged pictures were perceived to be older than the 
younger pictures for both the male professor (M for old = 43.86, SEM = .69, n = 76; M for young 
= 37.82, SEM = .71, n = 82) and the female professor (M for old = 43.12, SEM = .70, n = 75; M 
for young = 32.88, SEM = .53, n = 75). Additionally, the young male professor was seen as older 
than the young female professor. Pictures of the older male and female instructors did not differ 
(p > .05). 
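
For readers who want to check the effect sizes, partial eta squared can be recovered from an F 
ratio and its degrees of freedom using the standard identity F × df_effect / (F × df_effect + 
df_error). The sketch below applies that identity to the three manipulation-check F values 
reported above; the function and variable names are illustrative assumptions. 

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F values and degrees of freedom taken from the manipulation check.
age_effect = partial_eta_squared(148.13, 1, 304)
gender_effect = partial_eta_squared(17.99, 1, 304)
interaction_effect = partial_eta_squared(9.87, 1, 304)

for label, value in [
    ("aging of picture", age_effect),
    ("instructor gender", gender_effect),
    ("gender x aging", interaction_effect),
]:
    print(f"{label}: partial eta squared = {value:.2f}")
```

Rounded to two decimals, the three values agree with the .33, .06, and .03 reported in the text. 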

Collapsing Data 

Although our manipulation check revealed that images altered to communicate an older 
professor did cause students to estimate a higher age relative to the younger images, the 
perceived ages differed by only 2.84 years based on variability within conditions. Further, the 
male instructor was seen as older than the female instructor, particularly in pictures intended to 
communicate youth, even though the pictures were intended to create perceptions of similar age. 
Because student perceptions of instructor age were our primary concern, and perceived age 
relates to student ratings of teaching effectiveness (Arbuckle & Williams, 2003), we divided 
participants into two groups based on participants’ perceptions of the instructor as up to 40 years 
old or above 40 (median-split procedure using perceived age across the entire data set). 
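
The regrouping step can be sketched as below, assuming each participant contributes the gender 
of the pictured instructor and a perceived age in years. The records shown are hypothetical and 
only illustrate the rule that a perceived age of 40 or less counts as “younger” and anything 
above 40 counts as “older.” 

```python
from collections import Counter

AGE_CUTOFF = 40  # cutoff stated in the text: up to 40 versus above 40


def age_group(perceived_age, cutoff=AGE_CUTOFF):
    """Classify a perceived age as 'younger' (<= cutoff) or 'older' (> cutoff)."""
    return "younger" if perceived_age <= cutoff else "older"


def cell_counts(records):
    """Count participants in each gender-by-perceived-age cell."""
    return Counter((gender, age_group(age)) for gender, age in records)


# Hypothetical (pictured gender, perceived age) pairs, not the study data.
example_records = [
    ("male", 38),
    ("male", 45),
    ("female", 40),
    ("female", 52),
    ("female", 29),
]
example_counts = cell_counts(example_records)
```

Grouping on perceived rather than intended age means a participant who saw the “young” picture 
but judged the instructor to be over 40 is analyzed in the older condition. 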

Of the group in which the man was intended to look young, 49 participants assumed he 
was, in fact, at or below 40 years old; however, 33 students perceived him as over 40. In the 
male picture intended to look older, 66 students assessed him as over 40, and only 10 assumed he 
was 40 years old or less. Of female pictures, the younger image revealed a good student match 
with expected and perceived age, with 67 assuming she was 40 or younger; only 8 thought she 
was older than 40. Finally, the picture of the older female was perceived as above 40 by 59 
participants and up to 40 years old by 16 participants. 

The final analysis contained 77 participants perceiving a younger male professor, 81 
perceiving an older male professor, 94 perceiving a younger female professor, and 56 perceiving 
an older female professor. 

Primary Analysis 

We analyzed these data using a 2 (gender of professor) × 2 (perceived age of professor), 
between-groups MANOVA. Dependent variables of interest included seven items related to 
teacher effectiveness as well as the Brief Professor-Student Rapport Scale (Wilson & Ryan, 
2013), the attractiveness rating, and quiz grades in percent correct. 

Omnibus tests allowed us to further examine effects for gender of the instructor pictured, 
perceived age of the instructor, and the interaction (below, all p < .05). Gender of the instructor 
affected scores on student perceptions of the instructor’s ability to explain material well, F(1, 
304) = 3.86, p = .05, partial η² = .013, brief rapport, F(1, 304) = 12.54, p = .001, partial η² = .043, 
attractiveness, F(1, 304) = 88.53, p < .001, partial η² = .23, and quiz grades, F(1, 304) = 4.30, p 
= .039, partial η² = .014. Regardless of perceived age, the male instructor was rated as better at 
explaining the material (M = 3.41, SEM = .08) than the female instructor (M = 3.16, SEM = .09). 
However, the female instructor was assumed to foster more rapport (M = 3.42, SEM = .05) than 
the male instructor (M = 3.17, SEM = .04), and she was rated as more attractive (M = 4.74, SEM 
= .10) than the male instructor (M = 3.48, SEM = .09). Finally, students earned higher grades on 
the lecture quiz when they viewed a female professor (M = 59.04, SEM = 1.27) versus viewing a 
male professor (M = 55.43, SEM = 1.19). 

Perceived age of the instructor, regardless of gender, affected student perceptions of 
professor-student rapport, F(1, 304) = 13.84, p = .001, partial η² = .044, attractiveness, F(1, 304) 
= 19.21, p = .001, partial η² = .059, and quiz grades, F(1, 304) = 7.79, p = .006, partial η² = .025. 
Students rated younger instructors as more attractive (M = 4.40, SEM = .09) than older 
instructors (M = 3.82, SEM = .10) and assumed more rapport with the younger instructor (M = 
3.23, SEM = .05) than the older instructor (M = 2.93, SEM = .06). Interestingly, students scored 
significantly higher on a quiz of the lecture material if they believed the lecture to come from an 
older professor (M = 59.66, SEM = 1.30) versus a younger one (M = 54.81, SEM = 1.15). 

Attractiveness and quiz-grade outcomes were further explained by an interaction between 
instructor gender and perceived age, F(1, 304) = 11.83, p = .001, partial η² = .037, and F(1, 304) 
= 4.93, p = .027, partial η² = .016, respectively. As seen in Figure 1, across pictures of the female 
professor, the younger picture was rated as more attractive (M = 5.26, SEM = .12) than the older 
picture (M = 4.21, SEM = .15). Ratings of male attractiveness did not vary with age (p > .05). On 
the quiz outcome, students earned higher scores when they thought they were hearing from an 
older female (M = 63.39, SEM = 2.00) than a younger female (M = 54.68, SEM = 1.55) as well as 
compared with the older male (M = 55.93, SEM = 1.67). When both the male and female 
instructors were considered to be young, quiz scores did not differ (p > .05; see Figure 2). 

Figure 1. Participants rated a younger female instructor (perceived as up to 40 years old) as more 
attractive than an older version of the same instructor. Perceptions of male instructors did not 
differ based on age. Error bars represent SEM. 

Figure 2. Participants earned better quiz grades on a lecture they perceived to be given by an 
older female instructor than either a younger female instructor or an older male instructor. Error 
bars represent SEM. 


Discussion 

Our first hypothesis was that students would rate the male professor in the current study 
as more effective than the female professor, regardless of age. This hypothesis was supported. Of 
the seven separate items measuring effectiveness, explaining concepts arguably serves as the 
clearest indication of effective teaching, and this item reflected student perceptions in favor of 
male instructors. Our second hypothesis was that whereas the younger female professor would 
earn higher ratings of attractiveness and rapport than the older female professor, this effect 
would not be seen for male professors. This hypothesis was partially supported. Results were as 
expected for attractiveness, but for rapport, there were overall effects for gender and age (with 
younger and female professors being seen as engendering higher rapport), but these variables did 
not interact. Our third hypothesis that the younger female professor would inspire higher grades 
was not supported. In fact, participants who perceived an older female professor scored higher on 
the quiz. 

Students expect male professors to be effective in their work but expect female professors 
to spend time building supportive relationships with students. According to Kierstead, 
D’Agostino, and Dill (1988), male professors earned better student evaluations if they 
demonstrated competence, but female professors had to demonstrate both competence and 
warmth to obtain the same high ratings. Unfortunately, failure to behave as gender roles dictate 
(according to students) results in greater hostility toward female professors, in particular 
(Sprague & Massoni, 2005) and may also result in poorer evaluations. Professors who behave in 
a way to support expectations are rewarded in student evaluations. 

In our study, perhaps higher rapport ratings for female professors can be explained by 
higher ratings of attractiveness. Goebel and Cashen (1979) found that students perceived more 
attractive professors as friendlier, more encouraging, more organized, less likely to give too 
much work, and better professors overall than unattractive professors. Perhaps it is not women 
who are expected to be warm but attractive people who are expected to be warm. To test this 
potential explanation, we examined the correlations between attractiveness and rapport based on 
the female pictures, r(148) = .24, p = .003, and male pictures, r(156) = .23, p = .003, both of 
which yielded significant relationships. However, the large sample size influenced our ability to 
find significance, and the correlational values were weak to moderate. Certainly, attractiveness 
explains some variability in rapport, but we cannot say with confidence that gender of the 
professor did not further explain student ratings of rapport. 
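
To make “weak to moderate” concrete, the sketch below converts the two correlations reported 
above into shared variance (r squared) and into the t statistic that underlies their significance 
tests, t = r × sqrt(df / (1 − r²)). The helper names are illustrative assumptions; the r values 
and degrees of freedom are those given in the text. 

```python
import math


def shared_variance(r):
    """Proportion of variance two measures share, given their correlation."""
    return r ** 2


def t_from_r(r, df):
    """t statistic for testing a Pearson correlation against zero (df = n - 2)."""
    return r * math.sqrt(df / (1 - r ** 2))


# Attractiveness-rapport correlations reported for female and male pictures.
female_r2 = shared_variance(0.24)
male_r2 = shared_variance(0.23)
female_t = t_from_r(0.24, 148)
male_t = t_from_r(0.23, 156)

print(f"female pictures: r^2 = {female_r2:.3f}, t(148) = {female_t:.2f}")
print(f"male pictures:   r^2 = {male_r2:.3f}, t(156) = {male_t:.2f}")
```

In both cases attractiveness shares only about 5% to 6% of its variance with rapport, while the 
large degrees of freedom push t to roughly 3, which illustrates how a modest correlation can 
still reach significance in a sample of this size. 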

Likewise, students rated younger professors as more attractive and warmer (higher 
rapport) than older professors. In fact, perceived age correlated negatively with rapport, r(306) = 
-.27, p < .001, and ratings of attractiveness, r(306) = -.31, p < .001. These results are not 
surprising given societal norms. Lucacel and Baban (2014) asked participants in their twenties 
about perceptions on aging and found that the majority of people in their sample held a negative 
perception of old age and the aging process. Similarly, individuals under 35 years are less likely 
to believe that older people can be as effective as younger workers (Abramson & Silverstein, 
2004). Our study suggests that ageist attitudes are not discarded at the classroom door. Students 
expect older professors to be less effective teachers. 

In addition to student perceptions, we directly measured learning with a lecture quiz. We 
expected attractiveness, rapport, and grades to correlate positively with each other based on 
Hamermesh and Parker’s (2005) argument that students focus more on attractive teachers. 
However, we found significantly better recall when students thought the lecturer was an older 
female. This result is particularly surprising because students rated the older female as less 
attractive than the younger female. Students chose to focus most on a lecture provided by a 
female they perceived to be over 40 years old. 

An older female could activate a schema for “mother,” a female likely to expect a strong 
work ethic. Students’ desire to please a mother figure could increase focus during a brief lecture. 
Indeed, based on a significantly higher quiz grade, we must assume more focus. Higher quiz 
averages for those perceiving an older woman may indicate that students work harder for older 
women than younger women or men. This idea is supported by the fact that although perceived 
age and quiz grades correlated significantly in the overall sample, r(306) = .16, p = .006, this 
correlation was driven by female professors, r(148) = .27, p = .001. For male professors, the 
correlation between perceived age and quiz grades failed to reach significance, r(156) = .07, p > 
.05. Based on the positive relationship between quiz grades and perceived age of the female 
professor, activating a schema for “mother” may explain higher grades. 

Taken together, results of the current study reveal the impact of professor gender and age 
on student evaluations of teaching and grades. As instructors, we would benefit from minimizing 
potential negative effects reported here and maximizing benefits associated with professor 
gender and age. For example, Legg and Wilson (2009) found that when students received an 
emailed welcome message one week prior to the first day of class, motivation, attitudes toward 
the instructor, and retention were enhanced. Although students may be able to guess a 
professor’s gender based on a name, an email can attenuate the impact of age. 

Professors can also improve students’ early impressions on the first day of class. For 
example, professors might have what students consider an “ideal” first day, which includes 
covering the syllabus in a welcoming manner, avoiding homework, and ending class early. 
Although some may argue that ending class early on the first day sets a “lazy” tone for the remainder 
of the course, Wilson and Wilson (2008) found that following students’ wishes for the first day 
of class improved both student motivation and end-of-term grades. As another first-day activity, 
welcoming students by shaking hands increased ratings of instructor skill and ability to motivate 
students (Wilson, Stadler, Schwartz, & Goff, 2009). We should caution that this effect occurred 
for female professors only; for male professors, the opposite effect was seen. Certainly, many 
approaches can enhance students’ perceived effectiveness related to female instructors and 
rapport related to male instructors as well as older professors. 

How does the current study inform the use of teaching evaluations for professor tenure, 
promotion, raises, and awards? Traditional measures focus on student perceptions, not student 
performance. As a result, older female professors may be at a disadvantage. The practice of 
rewarding or punishing faculty based on student evaluations may be unfair if biases exist. In the 
case of the older female professor, she may be highly effective at helping students learn even if 
her teaching evaluations are relatively low. 

Potential Limitations and Suggestions for Future Research 

A potential limitation in this study is the fact that most students in this study were in an 
introductory psychology course. It is quite possible that more advanced students work through 
their expectations of others and learn to avoid bias. Unfortunately, a wealth of research suggests 
that bias, whether explicit or implicit, exists within people regardless of their age (e.g., Koch, 
D’Mello, & Sackett, 2015). To assess the external validity of our study, additional groups of 
students beyond those in introductory psychology should be examined. 

Students who reviewed professors in this study merely viewed pictures and heard a brief 
lecture. We recognize that the dynamic nature of a classroom is much more complex, limiting 
our external validity. For example, with limited teacher information, participants may have relied 
heavily on perceptions of physical attributes to make inferences about rapport and effectiveness. 
In the richness of classroom environments, students have additional information based on social 
interactions, perhaps creating a different pattern of results. However, a manipulated empirical 
study allows us to identify experimentally an ongoing bias in student evaluations. Such 
information illustrates the need for caution when depending on student evaluations for faculty 
tenure, promotion, raises, and awards. As long as gender and age bias exists in the minds of 
students, discrimination can occur. 